Integration of MLLR adaptation with pronunciation proficiency adaptation for non-native speech recognition

Authors

  • Nobuaki Minematsu
  • Gakuto Kurata
  • Keikichi Hirose
Abstract

Recognizing non-native speech requires handling larger acoustic and linguistic distortions in acoustic modeling, language modeling, lexical modeling, and/or the decoding strategy. This paper proposes a novel method to enhance MLLR adaptation of acoustic models for non-native speech recognition. For native speech recognition, MLLR speaker adaptation has been introduced successfully because it enables efficient adaptation from a small amount of adaptation data by using a regression tree of the Gaussian mixtures of the HMMs. For non-native speech, however, the regression tree built from the baseline HMMs in most cases does not match the pronunciation proficiency of the speaker. This paper provides a solution to this problem: the speaker’s proficiency is automatically estimated and a tree suited to that proficiency is built, which can be viewed as proficiency adaptation. Recognition experiments show that MLLR with the new tree raises the average error reduction rate to about 30%, up from approximately 20% for baseline MLLR.
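
The role of the regression tree in MLLR can be illustrated with a minimal sketch. It is not the authors' implementation: it only shows the generic idea that Gaussian means grouped under one tree node share an affine transform (mean' = A * mean + b), and that a node with too little adaptation data falls back to an ancestor's transform. The names Node, pick_transform, adapt_mean, and the min_frames threshold are hypothetical, and the transforms are assumed to have been estimated elsewhere.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """One node of a hypothetical regression tree over Gaussian means."""
    name: str
    parent: Optional["Node"] = None
    frames: int = 0                          # adaptation frames seen under this node
    A: Optional[List[List[float]]] = None    # transform matrix (assumed pre-estimated)
    b: Optional[List[float]] = None          # bias vector (assumed pre-estimated)


def pick_transform(node: Optional[Node], min_frames: int) -> Node:
    """Walk toward the root until a node has enough data and a transform."""
    while node is not None:
        if node.frames >= min_frames and node.A is not None and node.b is not None:
            return node
        node = node.parent
    raise ValueError("no node in the tree has enough adaptation data")


def adapt_mean(mean: List[float], node: Node, min_frames: int = 100) -> List[float]:
    """Apply the shared affine transform A * mean + b chosen for this node."""
    source = pick_transform(node, min_frames)
    return [
        sum(a * m for a, m in zip(row, mean)) + bias
        for row, bias in zip(source.A, source.b)
    ]
```

With a root node that has plenty of data and a child node that has very little, adapt_mean applied to the child uses the root's transform. This fallback is what lets MLLR work from a small amount of adaptation data, and it is also why the shape of the tree matters: the tree decides which Gaussians end up sharing a transform.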

Similar resources

Combined acoustic and pronunciation modelling for non-native speech recognition

In this paper, we present several adaptation methods for non-native speech recognition. We have tested pronunciation modelling, MLLR and MAP non-native pronunciation adaptation, and HMM model retraining on the HIWIRE foreign-accented English speech database. The “phonetic confusion” scheme we have developed consists of associating each spoken phone with several sequences of confused phones. In our...

Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation

A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...

Is non-native pronunciation modelling necessary?

It is difficult to recognize accented or non-native speech with speech recognition systems that are trained using native speech. While standard acoustic speaker adaptation techniques are often applied in these cases, they can only reduce the recognition errors that are due to mispronunciations on the phoneme level. They are not able to handle severe deviations from the expected pronunciation. A...

Unsupervised Joint Estimation of Grapheme-to-Phoneme Conversion Systems and Acoustic Model Adaptation for Non-Native Speech Recognition

Non-native speech differs significantly from native speech, often resulting in a degradation of the performance of automatic speech recognition (ASR). Hand-crafted pronunciation lexicons used in standard ASR systems generally fail to cover non-native pronunciations, and design of new ones by linguistic experts is time consuming and costly. In this work, we propose acoustic data-driven iterative...

Publication date: 2002